On Time Optimal Supernode Shape

Authors

Edin Hodzic

Weijia Shang

Abstract

With the objective of minimizing the total execution time of a parallel program on a distributed memory parallel computer, this paper discusses the selection of an optimal supernode shape for a supernode transformation (also known as tiling). We assume that the communication cost is dominated by the startup penalty and can therefore be approximated by a constant. We identify three parameters of a supernode transformation: supernode size, relative side lengths, and cutting hyperplane directions. For algorithms with perfectly nested loops and uniform dependencies, we give a closed-form expression for an optimal linear schedule vector and a necessary and sufficient condition for optimal relative side lengths. We prove that the total running time is minimized by a cutting hyperplane direction matrix whose rows are from the surface of the polar cone of the cone spanned by the dependence vectors, also known as the tiling cone. The results are derived in continuous space and should for that reason be considered approximate.
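The tiling-cone condition in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: it assumes the common convention that a supernode transformation is given by a matrix H whose rows are normals of the cutting hyperplanes, that iteration j falls in supernode floor(Hj), and that the tiling cone is the set of rows h with h·d >= 0 for every dependence vector d. The function names, the 2-D dependence set, and the two example matrices are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from math import floor


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def tile_index(H, j):
    """Supernode coordinates of iteration j: componentwise floor(H j)."""
    return tuple(floor(dot(row, j)) for row in H)


def in_tiling_cone(row, deps):
    """True if row . d >= 0 for every dependence vector d (assumed cone definition)."""
    return all(dot(row, d) >= 0 for d in deps)


def on_cone_surface(row, deps):
    """True if row is in the tiling cone and orthogonal to at least one dependence."""
    return in_tiling_cone(row, deps) and any(dot(row, d) == 0 for d in deps)


# Hypothetical uniform dependences of a 2-D perfectly nested loop.
deps = [(1, 0), (1, 1)]
q = Fraction(1, 4)  # side-length scale: four iterations per supernode side

H_rect = [(q, 0), (0, q)]    # axis-aligned (rectangular) supernodes
H_skew = [(q, -q), (0, q)]   # rows taken from the extreme rays of the cone

for name, H in (("rect", H_rect), ("skew", H_skew)):
    legal = all(in_tiling_cone(r, deps) for r in H)
    surface = all(on_cone_surface(r, deps) for r in H)
    print(name, "legal:", legal, "all rows on surface:", surface,
          "tile of (5, 9):", tile_index(H, (5, 9)))
```

In this example both matrices keep every dependence pointing forward between supernodes, but only the skewed matrix has all of its rows on the surface of the cone, which is the kind of shape the abstract identifies as time optimal. Exact rational arithmetic is used so that the orthogonality test is not affected by floating-point rounding.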

Similar resources

On Optimal Size and Shape of Supernode Transformations

Supernode transformation has been proposed to reduce the communication startup cost by grouping a number of iterations in a perfectly nested loop with uniform dependencies as a supernode, which is assigned to a processor as a single unit. A supernode transformation is specified by n families of hyperplanes which slice the iteration space into parallelepiped supernodes, the grain size of a supe...

On Supernode Transformation with Minimized Total Running Time

With the objective of minimizing the total execution time of a parallel program on a distributed memory parallel computer, this paper discusses how to find an optimal supernode size and optimal supernode relative side lengths of a supernode transformation (also known as tiling). We identify three parameters of supernode transformation: supernode size, relative side lengths, and cutting hyperplane...

Expediating IP lookups with reduced power via TBM and SST supernode caching

In this paper, we propose a novel supernode caching scheme to reduce IP lookup latencies and energy consumption in network processors. Instead of using an expensive TCAM based scheme, we imp... (Elsevier B.V., 2009; doi:10.1016/j.comcom.2009.10.006)

Matching-based preprocessing algorithms to the solution of saddle-point problems in large-scale nonconvex interior-point optimization

Interior-point methods are among the most efficient approaches for solving large-scale nonlinear programming problems. At the core of these methods, highly ill-conditioned symmetric saddle-point problems have to be solved. We present combinatorial methods to preprocess these matrices in order to establish more favorable numerical properties for the subsequent factorization. Our approach is base...

Data Parallel Code Generation for Arbitrarily Tiled Loop Nests

Tiling or supernode transformation is extensively discussed as a loop transformation to efficiently execute nested loops onto distributed memory machines. In addition, a lot of work has been done concerning the selection of a communication-minimal and a scheduling-optimal tiling transformation. However, no complete approach has been presented in terms of implementation for non-rectangularly til...

Journal:

IEEE Trans. Parallel Distrib. Syst.

Volume: 13

Publication date: 1999
